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ABSTRACT 

We present a new method for analyzing multi-detector maps containing contributions 
from several components. Our method, based on matching the data to a model in 
the spectral domain, permits to estimate jointly the spatial power spectra of the 
components and of the noise, as well as the mixing coefficients. It is of particular 
relevance for the analysis of millimeter-wave maps containing a contribution from 
CMB anisotropies. 

Key words: Cosmic microwave background - Cosmology: observations - Methods: 
data analysis 


1 INTRODUCTION 

Mapping sky emissions at millimeter wavelengths, and 
in particular Cosmic Microwave Background (CMB) 
anisotropies, is one of the main objectives of ongoing ob¬ 
servational effort in millimeter-wave astronomy. Sensitive 
balloon-borne and space-borne missions such as Archeops 
(Benoit et al 2002b), Boomerang (de Bernardis et al 2000), 
Maxima (Hanany et al. 2000) and MAP (Bennett et al. 
1997) are currently in operating status, yielding a large 
amount of multi-detector and multi-frequency measure¬ 
ments. Within a few years, the Planck mission (Lamarre 
et al. 2000; Bersanelli & Mandolesi 2000), to be launched 
by ESA in 2007, will observe the complete sky with ~ 100 
detectors distributed in nine frequency bands ranging from 
30 to 850 GHz. The main objective of these observations is 
the determination of the spatial power spectrum of CMB 
anisotropies. A secondary objective is identifying and map¬ 
ping the emission from all contributing astrophysical pro¬ 
cesses. 

The availability of several detectors operating in sev¬ 
eral bands makes it possible to devise new powerful data 
processing schemes. In particular, by combining data from 
several detectors, it is possible to improve substantially the 
signal-to-noise ratio (by weighted averaging) and to separate 
several foreground components (possibly of astrophysical in¬ 
terest in their own right) from the CMB by component sep¬ 
aration methods. Component separation, however, typically 
requires a good knowledge of the transfer function connect¬ 
ing a multi-component sky to multi-detector maps. 

This paper proposes to use spectral matching as a 
new approach to processing multi-detector multi-component 
(MDMC) data, in which all the information needed to esti¬ 


mate the spatial power spectra of components and/or to sep¬ 
arate them is sought in the data structure itself. The method 
works with or without prior detector calibration and gives 
access to spatial power spectra in a straightforward way; it 
is statistically efficient (being a maximum likelihood tech¬ 
nique) and computationally efficient (working with a small 
set of sufficient statistics rather than with original maps). 

This paper is organised as follows. The idea of spec¬ 
tral estimation via multi-detector multi-component spec¬ 
tral matching is introduced in section 2. Section 3 describes 
the technique in more detail, connects it to a maximum like¬ 
lihood method, and discusses the specific implementations. 
Section 4 is devoted to evaluating the performance of the 
method on synthetic Planck HFI observations. We discuss 
the method and extensions in section 5. 


2 THE MULTI-DETECTOR 

MULTI-COMPONENT FRAMEWORK 

Multi-detector CMB measurements can be modeled as re¬ 
sulting from the superposition of multiple components. Sta¬ 
tistically efficient data processing should coherently exploit 
this MDMC structure. 

The sky emission at millimeter wavelengths is well mod¬ 
eled at first order by a linear superposition of the emissions 
of a few processes: CMB anisotropies, thermal dust emis¬ 
sion, thermal Sunyaev Zel’dovich (SZ) effect, synchrotron 
emission, etc. The observation of the sky with detector d is 
then a noisy linear mixture of Nc components: 
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Nc 

yd(O,4>) = '^AdjSj{0,4>)+nd{O,4>) (1) 

3=-^ 

where Sj is the emission template for the jth astrophysical 
process, herein referred to as a source or a component The 
coefficients Adj reflect emission laws and detector properties 
while Ud accounts for noise. For simplicity, we neglect for the 
moment beam effects, postponing the discussion to section 5. 

Quantities of prime interest are spatial power spectra. 
For the j-th component, at frequency this is: 

Cj{i) = (\sF£)f) (2) 

where (•) denotes the expectation operator and i indexes 
either a Fourier mode or an (^, m) mode. 

In practice, power spectra are estimated by averages 
over bins: 

C3iQ) = CA£). ( 3 ) 

te'Dq 

where q = 1,..., Q is the spectral bin index, T>q is the set 
of frequencies contributing to bin q and Uq is the number of 
such frequencies.^ Typical bins can be bands imin < i < ^max 
extending over a range of one to tens of i values. 

Multi-detector power spectrum 

Since we focus on jointly processing the maps from all 
detectors, it is convenient to stack yi,..., into a single 
Nd X 1 vector Y. Then, the set of eqs. 1 for all Nd detectors 
is more compactly written in matrix-vector form as: 

Y{e,<i>) = ASi0,<i>) + Ni0,<t>) ( 4 ) 

with a so called Nd x Nc ‘mixing matrix’ A. In Fourier space, 
this equation reads 

Y(£) = AS(£)-tN(£). (5) 

The power spectrum of process Y is represented by the Nd x 
Nd spectral density matrix {Y{i)Y{iy) where denotes 
transpose-conjugation. Its average over bins 

= V Z (<7 = 1,.... Q) (6) 

will also be referred to as a spectral density matrix. Accord¬ 
ing to the linear model (1), it is structured as: 

RY{q) — ARs{q)A^-\-RN{q) (g=l,...,Q) (7) 

with Rs{q) and RN{q) defined similarly to Ryiq)- Statistical 
independence between components implies: 

Rs{q) = diag(C'i(g),.. .,CNciq)) ■ (8) 

For the sake of exposition, we assume that the noise is un¬ 
correlated, both across detectors and in space, so that the 
noise structure is described by Nd parameters: 

RN{q) = diag(af,... ,a%^) . (9) 

Parameter extraction by spectral matching 

The MDMC model, as defined by eqs. (7-8-9), depends 

^ It is customary for CMB data analysis to weight the terms 
in sum 3 by -h 1). For the sake of exposition, we use a flat 
weighting here (see section 5 for weighted sums) 


on a set {i^v(^)} of Q spectral density matrices, which in 
turn depend on {A, Cj (g), cr^)}, amounting to Nd x Nc -\- 
Q X Nc Nd scalar parameters. However, the number of 
independent correlations in Q spectral density matrices is 
Q X Nd(Nd + l)/2 (since each matrix is real symmetric). 
This later number is (in general) higher than the former. 

With this in mind, our proposal can be summarized as 
‘MDMC spectral matching’, meaning: estimate all (or parts 
of) the parameters {A, Cj(g), cr^)} by finding the best match 
between {Rviq)}, as specified by (7-8-9), and a set of Q 
‘empirical spectral density matrices’ {i?v(^)}: 

%{q) = —y^Y{£)Y{£)^ (g=l,...,Q) (10) 

ieT)q 

which are the natural non parametric estimates of the cor¬ 
responding Rviq). 

Some preliminary comments about the MDMC spectral 
matching approach are in order. 

Parameter choice: There is a lot of flexibility in the 
choice of parameters over which to minimize the spectral 
mismatch. By selecting different sets of parameters, differ¬ 
ent goals can be achieved. For instance, we may assume that 
matrix A and the noise spectrum i^w(^) are known so that 
the mismatch is minimized only with respect to the binned 
spectra Ci(q) of all components: the method appears as a 
spectral estimation technique which does not require the ex¬ 
plicit separation of the observed maps into component maps. 
Another important example, as illustrated in section 4, con¬ 
sists in including matrix A among the free parameters. Then, 
the method works as the so-called ‘blind techniques’, and 
permits the measurement of the emission law of the compo¬ 
nents, or the cross calibration of detectors. 

Degeneracies: A key issue in spectral matching is whether 
or not matrix A can be uniquely determined from the data 
only. When all parameters {A, Cj (g), cr^} are allowed to be 
adjusted, there are at least two clear indeterminations. First, 
the ordering (or numbering) of the components in the model 
is immaterial: matrix A cannot be recovered better than up 
to column permutation on the sole basis of a spectral match. 
Second, a scalar factor can be exchanged, for each compo¬ 
nent j, between the jth column of A and Cj{q). These scale 
factors cannot be determined from the data themselves. 

Another trivial case of indetermination is when two 
columns of A corresponding to physically distinct compo¬ 
nents are proportional. In this case, the sum of the two 
appears in the model as one single component. The iden- 
tffiability of the other components is not affected. 

A more severe degeneracy occurs if any two components 
have proportional spectra. In this case, as is known from the 
noiseless case (Pham & Garat 1997), only the space spanned 
by the corresponding columns of A can be determined in 
a spectral match with A as a free parameter. In this case 
however, the identifiability of the other components is un¬ 
changed, with no impact on the accuracy of component sep¬ 
aration with a Wiener method (sec. 5). The key point to re¬ 
member is that spectral matching requires spectral diversity 
to separate components associated with unknown columns 
of A. 

Maximum likelihood: Section 3.1 explains why ‘spectral 
matching’ corresponds to maximum likelihood estimation. 
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This happens in a Gaussian stationary model with smooth 
(actually: constant over bins) spectra. In such a model the 
likelihood of the observations is a measure (12) of spectral 
matching. Since the likelihood then depends on the data 
only via the empirical spectral density matrices, the mas¬ 
sive data reduction gained from replacing the observations 
by a (usually) much smaller set of statistics (the empirical 
spectral density matrices Ryiq)) is obtained without infor¬ 
mation loss. 

Comparison with component separation: It is inter¬ 
esting to compare spectral matching to techniques based on 
prior explicit component separation. 

Producing a CMB map as free as possible from fore¬ 
ground and noise contamination is the objective of the com¬ 
ponent separation step, in which maps obtained at different 
frequencies are combined to maximize the signal to noise 
ratio (where noise includes also foreground contamination). 

The usual approach for taking advantage of multi¬ 
detector measurements can be summarised as: first, form 
estimates Sj(^) of component maps Sj(£) (via component 
separation), second, estimate the spectrum of each compo¬ 
nent j by averaging within bins: 

= ;r E (11) 

Tiq ‘ 

ieT)q 

with, possibly, some post-processing of the power spectrum 
estimates. 

This method suffers from two difficulties. First, the 
best component separation methods typically require the 
prior knowledge of the statistical properties of the compo¬ 
nents (including the CMB power spectrum) and of the noise. 
Second, recovered maps contain residuals (including noise) 
which contribute to the total power, biasing the spectrum 
estimated on the map, unless the power spectrum of these 
residuals can be estimated accurately and subtracted for de¬ 
biasing. 

In contrast our approach takes the reverse path. The 
first step is the estimation of the spectrum for the multi¬ 
detector map (which takes the form of a sequence of spectral 
density matrices). This first step preserves all the joint corre¬ 
lation structure between maps. In essence, the second step 
(spectral matching) amounts to resolving the joint power 
spectrum into spectra of individual components. 

Hence, instead of first separating component maps and 
then computing power spectra, we first compute the multi¬ 
variate power spectrum and then separate component spec¬ 
tra. 


3 MDMC SPECTRAL MATCHING IN 
PRACTICE 

The implementation of MDMC spectral matching is now 
described in more detail. Section 3.1 introduces the spectral 
matching criterion; section 3.2 describes the EM algorithm 
for its optimization; section 3.3 describes a complementary 
technique for fast convergence. 


3.1 Maximum likelihood spectral matching 

Any reasonable measure of mismatch between the empiri¬ 
cal density matrices {Rv(g)} and their model counterparts 
{Rv(g;^)} could be used to compute estimates of a ^ pa¬ 
rameter. In order to get good estimates, however, one should 
use a mismatch criterion derived from statistical principles. 
Such a derivation can be based on the statistical distribu¬ 
tion of the Fourier coefficients of a stationary process which 
are (at least asymptotically in the data size) normally dis¬ 
tributed, uncorrelated, with a variance proportional to the 
power spectrum (Whittle approximation, see appendix B). 
Thus, the likelihood of the observations can be readily ex¬ 
pressed in terms of spectral density matrices. Appendix B 
outlines how the (negative) log-likelihood of the data then 
is (up to irrelevant factors and terms) equal to 

Q 

0(0) = ^n, D(fly(g),i?y(g;0)) (12) 

q=l 

where T)(-, •) is a measure of divergence between two positive 
n X n matrices defined by 

D(Ri, R 2 ) = tr (^RiR2^) — logdet(RiR^^) — n. (13) 

It can be seen^ that D(Ri, R 2 ) > 0 with equality if and only 
if Ri = R 2 . Thus spectral matching corresponds to maximum 
likelihood estimation in a stationary model The minimizer 
of ())(0) is then a maximum likelihood estimate, and inherits 
the good statistical properties associated to it. 

Only in an asymptotic framework can maximum like¬ 
lihood procedures be proved to reach minimum estimation 
variance. It means that criteria which are equivalent to (12) 
are expected to have the same statistical quality as (12). 
In particular, criterion (12) can be replaced by a quadratic 
approximation: when each Rviq) is close RY{q] ^), a second- 
order expansion of D{Ry^ Ry) yields 

D 2 ^RyiRy^ — tr ^Ry^ {Ry — Ry)Ry^{^y ~ • (14) 

The resulting quadratic criterion is of particular interest 
when the unknown parameters enter linearly in RY{q; 0) (for 
instance when A is known and 0 only contains the binned 
power spectra of the components) since then criterion mini¬ 
mization becomes trivial. In this paper, however, we stick 
to using (12-13). Even though the divergence (13) may, 
in the general case, seem more difficult to deal with than 
its quadratic approximation (14), it actually lends itself to 
simple optimization via the EM algorithm (see section 3.2) 
thanks to its connection to the likelihood. 

3.2 The EM algorithm 

The expectation-maximization (EM) algorithm (Dempster 
et al. 1977) is a well known technique for maximizing the 
likelihood of statistical models which include ‘latent’ or ‘un¬ 
observed’ variables. It is well suited to our purpose by taking 
the components as the latent variables. The EM algorithm 
is iterative: starting from an initial value of the parameters, 

^ For instance by expressing D(Ri, R 2 ) in terms of the eigenval- 
_ i _ i 
ues of 2 ^ R 1 R 2 ^ • 
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it performs a sequence of parameter updates called ‘EM- 
steps’. Each step is guaranteed to increase the likelihood of 
the parameters. 

The spectral matching criterion (12) actually being a 
likelihood function in disguise, the EM algorithm can be 
used for its minimization. Each EM step is guaranteed to 
improve the spectral fit by decreasing 0(^). 

We consider the regular EM algorithm, based on the 
Gaussian likelihood described in appendix B and taking as 
‘latent variables’ the spectral modes Y(£). The form of the 
EM steps immediately follows as sketched in appendix C 
and summarized by the pseudo-code. 


Require: Spectral density matrices Rv(l), • •., Ry{Q) 
Require: Initial value of ^ = ^A,Cj{q), . 

Set Ryy(q) = Rriq) and Ryy = Ryvig)- 

repeat 

{_E-step. Compute conditional statistics:} 

Set Rs(q) = diag(Cj(^)) and Rn = diag(cr^) 
for g = 1 to Q do 

Giq) = iA^R],^A + Rsiq)-^)-^ 

W{q) = G{q)YR-C 

Rss{q) = W{q)RY{q)W{q)^+ G{q) 

Rsy{q) = W{q)RY{q) 
for 

Rss = EU ^ RUq) 


Rys — n Ry^il) 


{- 


A - RysRss 

Ciiq) = 


. M-step. Update the parameters:} 




Ryy RsyRss Rsy 


Rescale the parameters (see text), 
until a convergence criterion is satisfied 


Algorithm 1: The EM algorithm for minimizing the 
MDMC spectral mismatch 0(^) with respect to ^ = 

{A,Cj{q),al}. 


It is worth mentionning that EM steps take such a regu¬ 
lar structure when the parameters are 0 = {A, Cj(q),a‘^}. A 
slightly different form would result from a more constrained 
parameter set. 

Recall that, as previously noted, there is a scale in¬ 
determination on each component’s spectrum when 0 = 
{A, Cj (g), cr^}. We have found that this inherent indetermi¬ 
nation must be explicitly fixed in order for EM to converge 
(this is the rescaling step in the last line of the pseudo-code). 
Our strategy is, after each EM step, to fix the norm of each 
column of A to unity and to adjust the corresponding power 
spectra accordingly. This is an arbitrary choice which hap¬ 
pens to work well in practice. 


domains) have a very small effect on the criterion. In or¬ 
der to reach the true minimum of 0(^), it appears necessary 
to complement EM with another minimization technique. 
The strategy is to use the straightforward EM algorithm 
to quickly get close to the minimum of 0(^) and then to 
complete the minimization using a dedicated minimization 
algorithm. This complementary algorithm can use a simple 
design thanks to the good starting point provided by EM. 

The spectral mismatch criterion (12) can, in theory, 
be minimized by any optimization algorithm. However, the 
same effect which slows down EM in its final steps also 
makes the minimization of the mismatch criterion (12) dif¬ 
ficult for any algorithm. In particular, simple gradient algo¬ 
rithms are unacceptably slow. Actually, we found that even 
conjugate gradient techniques cannot overcome this prob¬ 
lem and had to resort to a quasi-Newt on method. We have 
used the classic BEGS (Broyden-Eletcher-Goldfarb-Shapiro) 
algorithm (Luenberger 1973). This technique minimizes an 
objective function by successive one-dimensional minimiza¬ 
tion (line searches). At each step, the direction for the line 
search is the gradient ‘rectified’ by the inverse of Hessian 
matrix. The BEGS technique is a rule to update an esti¬ 
mate of the inverse Hessian matrix at low computational 
cost. 


4 TESTING AND PERFORMANCE 

We now turn to illustrating the applications and perfor¬ 
mance of our multi-detector multi-component spectral- 
matching method on a simple set of synthetic observations: 
three-component noisy linear mixtures featuring contribu¬ 
tions from GMB anisotropies, dust emission, and SZ ther¬ 
mal emission. Unbiasedness and statistical uncertainties are 
investigated by a Monte-Garlo technique. 

Five implementations of the method for different appli¬ 
cations will be discussed: 

(i) a multi-component spatial power spectrum estimation 
assuming the mixing matrix is known, 

(ii) a blind approach in which spatial power spectra, noise 
levels, and the emission laws of components are jointly esti¬ 
mated on the data, 

(iii) a semi-blind approach where GMB and SZ emission 
laws are assumed to be known, and the emission law of the 
dust component (in addition to spatial power spectra and 
noise levels for all components) is estimated from the data, 

(iv) an application for detector cross-calibration, 

(v) a Wiener-filter component separation using parame¬ 
ters estimated via blind spectral matching. 

Finite beam effects are neglected for the present work, al¬ 
though they are not a fundamental limitation for our method 
(see sec. 5). For definiteness, we also assume here that the 
noise is white, although this assumption can be relaxed as 
well if needed. 


3.3 Non linear optimization 

When applied to our data, the EM algorithm shows fast 
convergence in a first phase and then enters a second phase 
of slower convergence. This is due to the fact that some pa¬ 
rameters {e.g. sub-dominant power spectra in some spectral 


4.1 Simulated data 

Synthetic observations in six frequency bands identical to 
those of the Planck HFI are generated on 300 x 300 pixel 
maps corresponding to a 12.5° x 12.5° field located at high 
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CHANNEL 1 : 100 GHz 


CHANNEL 2 ; 150 GHz 


CHANNEL 3 : 217 GHz 



CHANNEL 4 ; 353 GHz 


CHANNEL 5 : 545 GHz 


CHANNEL 6 : 857 GHz 


Figure 1. Simulated observations for six frequency bands 



100 

143 

217 

353 

545 

857 

CMB 

0.889 

0.926 

0.896 

0.275 

0.0019 

1.3X10-'^ 

dust 

9X 10“^ 

6X 10-4 

0.0082 

0.215 

0.687 

0.938 

SZ 

0.0064 

0.0032 

2x 10-"^ 

0.0044 

0.00019 

5.2X 10“® 

noise 

0.102 

0.0727 

0.108 

0.536 

0.320 

0.0667 


Table 1. Fraction of the power in each of the components 


galactic latitude. For each mixture realisation, synthetic 
components and noise are obtained as follows: 

• The CMB component is a COBE-normalised, randomly 
generated realization of CMB anisotropies obtained using 
the spatial power spectrum predicted by the CMB- 
FAST software (Zaldarriaga & Seljak 2000) with Hq — 65 
km/s/Mpc, Qm = 0.3, Q.h — 0.045, A = 0.7. 

• The galactic dust emission template is obtained from 
the 100 /xm IRAS data in the sky region located around 
a = 204° and 6 = 11°. Bright stars are removed using a 
point source extracting algorithm. Residual stripes are cut 
out by setting to zero the contaminated Fourier coefficients. 
The Fourier modes suppressed in this way are randomly re¬ 
generated with a distribution obtained, for each mode, from 
the statistics of the other modes at the same scale in the 
IRAS map. This method preserves the (assumed) statistical 
azimuthal symmetry and general shape of the spatial spec¬ 
trum. 

• The thermal Sunyaev-Zel’dovich template is drawn at 
random from a set of 1500 SZ maps generated for this pur¬ 
pose using the software described in (Delabrouille et al. 
2002 ). 

• White noise at the level of the nominal per-channel 
Planck HFI values is added to the observations. 

Synthetic observations are displayed in fig. 1. The gen¬ 
eral common pattern which can be seen in the lowest fre¬ 
quency channels is simulated CMB anisotropies, whereas 
the pattern of emission of interstellar dust as observed with 
IRAS dominates our 857 and 545 GHz maps. The contri¬ 
bution of the SZ effect, very sub-dominant, is not obviously 
visible on these maps. 


Table 1 gives, for each channel, the relative power of 
all components and of noise for a typical synthetic mixture 
(here ‘relative’ means: the sum of all powers is normalised 
to unity). Typical input templates for the three components 
can be seen in figure 6, left column. 


4.2 Application 1: Spectral estimation 

The first application is the estimation of component spa¬ 
tial power spectra. It is assumed that the mixing matrix is 
known, but that the noise level for each map is not known 
precisely. The set of parameters to be estimated from the 
data then is ^ = {C'j(^), cr^}. 

Component spectra are estimated on 32 ring-shaped do¬ 
mains for 5,000 different mixtures. The first 30 domains are 
equally spaced rings covering the lowest 60% of the spa¬ 
tial frequencies (0 < ^/^max < 0.6), and the remaining two 
cover respectively 0.6 < ^/^max < 0.8 and 0.8 < ^/^max < 1. 
This choice of spectral domains is adapted to the assumed 
azimuthal symmetry of the spectra by the choice of ring- 
shaped domains, and has a large number of rings in the re¬ 
gion where the signal is strong and where information from 
source spectra is relevant. 

The result of the estimation of the spatial power spec¬ 
trum of the three components in the relevant frequency 
range is shown in fig. 2. Errors on estimated spectra are ob¬ 
tained from the dispersion over the 5,000 distinct simulated 
observations. For the SZ effect, the spatial power spectrum 
is averaged into larger bins after parameter estimation to 
reduce the scatter of the measurements. The figure shows 
that, as expected, a low-variance unbiased power spectrum 
is obtained for all components without explicit separation of 
the observations into component maps. For the CMB, the 
measurement is sample (cosmic) variance limited at small 
spatial frequencies. Such an effect does not appear on the 
dust spectrum estimate because we use only one dust map 
in the Monte-Carlo. 


4.3 Application 2: Blind parameter estimation 

Let us now assume that the exact emission laws of all com¬ 
ponents are unknown. Then the full parameter set, to be 
estimated from the data, is ^ = {Cj(g), cr^. A}. Again, we 
estimate parameters on 5,000 different simulated data sets. 
For each run, the scale indetermination between mixing ma¬ 
trix columns and component power spectra is fixed by renor¬ 
malising to the true value of A at a single reference frequency 
(100 GHz for the CMB and thermal SZ effect, and 857 GHz 
for the dust). Error bars (Tier) for all parameters are com¬ 
puted from the distribution of the estimates over all simu¬ 
lated observations. 

Figure 3 displays recovered emission spectra (diamonds 
with 1(7 error bars) as compared to exact emission spectra 
(solid lines). Emission laws of all components are recovered 
with no significant bias. The CMB emission law is recov¬ 
ered very accurately at all frequencies except 857 GHz. The 
dust emission law is recovered quite accurately at high fre¬ 
quencies, less accurately at frequencies where it is very sub¬ 
dominant. The SZ effect emission shape, sub-dominant at 
all frequencies, is recovered with larger relative error bars. 
Because of the renormalisation, error bars for CMB and SZ 
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Figure 2. This figure shows the recovered spatial power spectrum (crosses) compared to the exact band-averaged spectra (solid 

lines) for CMB (left), dust (middle), and Thermal SZ effect (right). These results correspond to a non-blind MDMC spectral estimation 
in which the mixing matrix A is known. Vertical bars show the Icr errors, and horizontal bars the spatial frequency range of each bin. 


channel 

100 

143 

217 

353 

545 

857 

RMS est. xlO-® 

(29.1 ± 0.22) 

(18.7 ± 0.13) 

(12.85 ± 0.09) 

(11.92± 0.07) 

(8.98 ± 0.05) 

(4.97 ± 0.06) 

RMS true xlO"® 

29.11 

18.70 

12.86 

11.93 

8.980 

4.970 


Table 2. Comparison of true and estimated noise levels (RMS). The errors are obtained from the dispersion of results obtained using 
10,000 different mixtures. 



Figure 3. The figure shows the recovered emission laws of the 
components (diamonds) compared to the exact emission laws used 
in the simulations (solid lines). The errors are computed from the 
dispersion of the recovered values for 10,000 different synthetic 
mixtures. 


vanish at 100 GHz, and the dust emission law error bar van¬ 
ishes at 857 GHz. 

Spatial power spectra, in turn, are also estimated. As 
shown in figure 4, GMB and dust spatial power spectra are 
recovered with good accuracy and no significant bias, almost 
as well as for the non-blind spectral estimation. The SZ 
power spectrum is also significantly constrained, although 
error bars are significantly larger than in the non-blind spec¬ 
tral estimation. 


Finally, table 2 shows the estimates of the noise RMS 
as compared to true levels. Relative errors are below 2.5 % 
for all channels. 


4.4 Application 3: Semi-blind parameter 
estimation 

In our particular case, the emission laws of the GMB and of 
the SZ are known to almost perfect accuracy. Assume, how¬ 
ever, that measuring the dust emission law is of particular 
interest. How much do we gain by forcing known emission 
laws to their true value, and estimating only the unknown 
dust emission spectrum? 

We repeat the simulations described in 4.3, now fix¬ 
ing two columns of the mixing matrix, and estimating the 
third one (in addition to domain-averaged spatial spectra 
and noise levels). Table 3 compares quantitatively the rel¬ 
ative errors on the resulting dust emission law. At low fre¬ 
quency (between 100 and 217 GHz), the accuracy of the 
estimation is improved by a factor of 2 to 3. At 353 GHz, 
the improvement is still noticeable, but at 545 GHz, where 
the dust emission begins to dominate, the blind and semi¬ 
blind approaches give similar errors. The use of partial prior 
information on the mixing matrix A is thus useful here to 
improve the estimation of the entries of A which contribute 
little relative power to the observations. 

In addition to this substantial improvement in estimat¬ 
ing the unknown ‘dust column’ of A, the semi-blind ap¬ 
proach is more efficient for estimating the SZ power spec¬ 
trum than the full blind implementation. Figure 5 shows 
the comparison of the quality of spectral estimation in the 
blind and semi-blind approaches relative to the non-blind. 
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spatial frequency (deg ') 


Figure 4. The figure —similar to fig. 2 but for blind spectral matching— shows the recovered power spectra of the components compared 
to exact ones (solid lines). The errors are computed from the dispersion of the recovered values for 5,000 different synthetic mixtures. 


channel 

100 

143 

217 

353 

545 

857 

true dust em spectrum 

0.3071 

0.5902 

1.2177 

2.6106 

4.5371 

6.4288 

relative error, blind approach 

6.229 

2.469 

0.634 

0.0662 

0.00790 

no values 

relative error, semi-blind approach 

2.623 

1.056 

0.285 

0.0368 

0.00725 

no values 


Table 3. Relative errors on dust emission law estimation. In the first case, all the elements of the mixing matrix are estimated (blind 
approach). In the second case, the columns of the mixing matrix which corresponds to the CMB and the thermal SZ components are 
fixed (semi-blind approach). Although the semi-blind approach does not improve significantly the determination of the dust spectrum at 
545 GHz, the improvement is very significant (factors of two to three) at other frequencies. 


To the precision of our Monte-Carlo tests (1-2% level on er¬ 
ror bars), the semi-blind result is as accurate for this partic¬ 
ular mixture as the non-blind estimate, significantly better 
than the blind result. As the semi-blind and the non-blind 
estimates give similar results, however, the actual enhance¬ 
ment in precision depends on details of the mixture and 
parametrization. 

This comparison, however, shows that it is in general 
useful to exploit as much as possible reliable prior informa¬ 
tion. Our method is flexible enough to do so. 

4.5 Application 4: Detector calibration 

The mixing matrix A depends not only on components 
(through emission spectra), but also on detectors (through 
frequency bands and optical efficiency). Mixing matrix co¬ 
efficients Adj^ expressed in readout (rather than phys¬ 
ical) units can be approximated by the product of a 
detector-dependent calibration coefficient aa and an emis¬ 
sion law ej{iy): 

Adj ~ OLdejifUd) (15) 

where Vd is the central observing frequency of detector d. 
Used on a data set from detectors observing in the same 
frequency band, the estimation of A for any astrophysical 
component gives relative calibration coefficients between de¬ 
tectors. If in addition the emission law of at least one of 
the components is known (e.g. CMB anisotropies), the esti¬ 
mation of the mixing matrix provides a relative calibration 
across frequency bands. Finally, if among the components 


there is one with known emission spectrum and known am¬ 
plitude (or known spatial power spectrum), absolute calibra¬ 
tion can be obtained in the same way. For instance, it is not 
excluded that in the not-so-far future, a high resolution ex¬ 
periment dedicated to a wide-field point source survey in the 
millimeter range can be calibrated on CMB anisotropies(!). 

4.6 Application 5: Component separation 

The separation of astrophysical components by some kind 
of inversion of the linear system of equation 1 has been the 
object of extensive previous work. Popular linear methods 
are listed in appendix A. In a Gaussian model, the best in¬ 
version is obtained by the Wiener filter. This filter, however, 
requires the prior knowledge of the mixing matrix A, com¬ 
ponent spatial power spectra, and noise power spectra. As 
discussed by (Cardoso et al. 2002), our spectral-matching 
method yields all the parameters needed to implement a 
Wiener-based component separation on maps. 

We compare the quality of component reconstruction 
using either the estimated parameter set 6 — {Cj(^), cr^. A} 
or ‘true’ best-knowledge values. 

Reconstructed maps. Figure 6 illustrates the quality of 
map reconstruction by Wiener inversion. The first column 
displays the input components, the second column shows 
components recovered with the exact Wiener filter (com¬ 
puted from the true mixing matrix, ensemble averages of 
the noise, ensemble averages of CMB and SZ power spec¬ 
tra, and a fit of the spatial power spectrum of the dust 
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CMB SPATIAL SPECTRUM ESTIMATION 



DUST SPATIAL SPECTRUM ESTIMATION SZ SPATIAL SPECTRUM ESTIMATION 




Figure 5. The figure shows the comparison of the quality of the blind and semi-blind power spectrum estimation of the components. 
The solid line displays the ratio between the size of the Icr error in the semi-blind and in the non-blind spectral matching, showing that 
they are comparable to within simulation accuracy. In contrast, the dotted line shows the ratio between the size of the Icr error in the 
blind and in the non-blind spectral matching, showing that some accuracy is lost when all components of the mixing matrix are adjusted 
as additional parameters. 


template). The third column displays the components re¬ 
covered by Wiener inversion using estimated parameters. In 
both cases, CMB and dust emissions are recovered satisfac¬ 
torily, but the SZ effect —strongly peaked and hence poorly 
suited to processing in Fourier space— remains noisy. Visu¬ 
ally, both methods perform about as well. 

Contamination levels. The quality of the separation can 
be assessed by a measure of contamination levels, i.e. how 
much of the other components gets into a component’s map 
after separation. 

The Wiener matrix, W = , 

obtained with exact values of A, Rn and Rs, differs slightly 
from its estimate VF, computed with estimates A, Rn and 
Rs(q)- Not only Rs(q) differs from Rs because it is an es¬ 
timate, but also because it is a flat band-power approxima¬ 
tion. 

At each frequency, off-diagonal terms ofWA correspond 
to leakage of other components into one component’s esti¬ 
mate at spatial frequency k. Each panel of figure 7 refers to 
one component (CMB, dust and SZ), and shows the relative 
contribution of all components and of noise to the recovered 
map. Levels are relative to the true map, so that the contri¬ 
bution of a component to its own recovered map illustrates 
the spatial filtering induced by the Wiener inversion. The 
figure illustrates that the inversion done with blindly esti¬ 
mated parameters performs almost as well as the separation 
using exact values of the spectra and mixing matrix. Differ¬ 
ences are typically much smaller than noise contamination, 
which is comparable with the blind and the non-blind ap¬ 
proaches. 


5 DISCUSSION 

5.1 Related work on component separation 

Explicit component separation has been investigated first 
in CMB applications by Tegmark V Efstathiou (1996), 
Bouchet V Gispert (1999), and Hobson et al. (1998). In these 


applications, all the parameters of the model (mixing ma¬ 
trix, noise levels, statistics of the components, including the 
spatial power spectra) are assumed to be known. 

Recent research has addressed the case of an imperfectly 
known mixing matrix. It is then necessary to estimate it (or 
at least some of its components) directly from the data. Eor 
instance, Tegmark et al. assume power law emission spectra 
for all components except CMB and SZ, and fit spectral 
indices to the observations (Tegmark et al. 2000). 

More recently, it has been proposed to resort to ‘blind 
source separation’ or ‘independent component analysis’ 
(ICA) methods. The work of Baccigalupi et al. (2000), fur¬ 
ther extended by Maino et al. (2002) implements a blind 
source separation method exploiting the non-Gaussianity 
of the sources for their separation. This infomax method, 
unfortunately, is not designed for noisy mixtures and can 
not deal with a frequency-dependent beam. 

The idea to use spectral diversity and an EM algorithm 
for the blind separation of components in CMB observations 
was proposed first by Snoussi et al. (2001). This approach ex¬ 
ploits the spectral diversity of components as in our MDMC 
spectral matching, but assumes the prior knowledge of the 
spatial power spectra of the components. Our approach ex¬ 
tends further on this idea, with a lot more flexibility, and the 
new point of view that spatial power spectra are actually the 
main unknown parameters of interest for CMB observations. 

Other reports of blind component separation in astro¬ 
nomical data include Nuzillard V Bijaoui (2000) and Eunaro 
et al. (2001). 

5.2 Comments on the spectral matching approach 
Robustness 

Our approach assumes that the data are collected in the 
form of a linear mixture of a known number of components 
that are independent, have different spatial power spectra, 
and different laws of emission as a function of frequency. 
These assumptions are valid in the three-component mix- 
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Figure 6. Left: true templates used as inputs. Middle: templates recovered using the ‘true’ Wiener filter. Right: templates recovered 
using the blind separation. 


tures used in our simulations. Applying this method to real 
data obtained with the Archeops experiment (Benoit et al 
2002a) gave us the opportunity to test that the method is 
quite robust, with satisfactory performance even when the 
noise is not white nor stationary, and when some residual 
systematic effects remain in the data. Of course, the exact 
impact of large departures from the model remains to be 
tested on a case by case basis. 

Detector—dependent beams 

It is quite usual in CMB observations that, because of 
the diffraction limit, the resolution of the available maps de¬ 
pend a lot on frequency. For Planck, the resolution ranges 


from about 30 arc-minutes at 30 GHz to 5 arc-minutes at 
350 GHz and higher. It is mandatory that a method com¬ 
bining all observations can benefit from the full resolution 
of the highest frequency channels. MDMC spectral match¬ 
ing, being implemented in Fourier (or spherical harmonics) 
space, permits to take beam effects into account straight¬ 
forwardly by including in the model the effect of a transfer 
function. 

Identifying components 

In practice, MDMC spectral matching runs with a fixed 
number of components. This number might not be well 
known (or even not very well defined), and must be guessed 
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RECOVERED CMB RECOVERED DUST 




spatial frequency (deg ') spatial frequency (deg 


REOOVERED SZ 



Figure 7. Contributions to the output map as a function of spatial frequency, relative to the true level of that component map. The 
left panel is for CMB, the middle panel for dust, and the right panel for the SZ effect. Results obtained with exact values of the mixing 
matrix, the spectra and noise levels are plotted as plain lines, and results obtained with the Wiener implementation using estimated 
parameters as diamonds. 


(or assumed). For CMB applications, an educated guess can 
be made (as usual for all component separation methods). 

A practical way to handle this issue consists in ap¬ 
plying the method several times with a increasing number 
of expected components. Comparing successive results per¬ 
mits to identify ‘stable’ components, which remain essen¬ 
tially unchanged when more components are sought. Too 
few components result in unsatisfactory identification and 
poor adjustment of the model to the empirical spectrum. 
Too many components results in the separation of artificial 
components, either very weak, or single detector noise maps. 

With this strategy, the method can be seen as a com¬ 
ponent discovery tool, which can be useful in particular to 
uncover and separate out instrumental effects behaving as 
additional components. 

Connected to the issue of component identification is 
the uniqueness (or identifiability) problem. As discussed 
above, MDMC spectral matching uses spectral diversity as 
the ‘engine’ of blind separation: components with propor¬ 
tional spatial power spectra (or nearly so) are not (or poorly) 
separated. In the current test, the three components are dif¬ 
ferent enough that no such problem arises. In richer mix¬ 
tures, containing contributions from several galactic com¬ 
ponents, it is quite possible that spectral diversity does not 
hold. If, for instance, several galactic components have a spa¬ 
tial power spectrum proportional to l/k^, the method would 
satisfactorily estimate parameters relevant to the CMB and 
the SZ effect, but is unable to unmix galactic contributions. 
A way out is to use a semi-blind approach in which some 
entries of the mixing matrix are forced to zero when the con¬ 
tribution of a particular component at a particular frequency 
is known to be negligible. This is the object of forthcoming 
research. 


5.3 Comments on the Wiener inversion 

After adjusting the parameters of the model to the data, 
the recovered mixing matrix, spectra, and noise levels can 
be used for component separation by Wiener inversion. 

Quite interestingly, the Wiener filter can be imple¬ 
mented for identified components even if some sub-mixtures 
are not identified (for instance by lack of spectral diversity). 
It can be shown straightforwardly that the Wiener form: 

W = (16) 

can be rewritten equivalently as: 

W = RsA'f[ARsA'f + RNr^ (17) 

or 

W = RsA^rC (18) 

Thus, the Wiener inversion for component j requires only 

an estimate of Ry (readily available as Ry), of the spatial 
power spectrum of component j, and of the column of the 
mixing matrix A corresponding to component j. Therefore, 
it is not necessary to identify all components, nor to know all 
spatial power spectra, nor to know noise levels, to separate 
the CMB from the other components. We just need to know 
the CMB emission law (which we do) and its spatial power 
spectrum (which can be estimated blindly with our method). 

As a final note, we stress that the Wiener method 
has the property of filtering the data spatially - an un¬ 
pleasant fact when power spectra are estimated on sepa¬ 
rated maps. In contrast, MDMC spectral matching adjusts 
domain-averaged spatial power spectra on the data prior to 
component map separation (bypassing the need for power- 
spectrum estimation on output maps). 

5.4 Comments on spectral estimation 

In the above discussion, we have assumed for simplicity that 
the noise is spatially white for all detectors. This assump- 
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tion, however, can be relaxed if needed, without (in general) 
loosing identifiability. 

If the noise is uncorrelated between detectors, noise 
terms appear only on the diagonal of the multivariate 
power spectrum of the observations Ry • Off-diagonal terms 
contain only contributions from the off-diagonal terms of 
ARsAh If noise power spectra are completely free, off- 
diagonal terms of Ry constrain ARsA\ and diagonal terms 
serve to measure Rn ■ 

For instance, if the mixing matrix A is known, it is 
possible to adjust simultaneously the spatial power spectrum 
of the components and that of the noise on the data, as long 
as enough observations are available which is generically the 
case. 

If data from several experiments are analyzed jointly, 
however, no correlated noise of instrumental origin is ex¬ 
pected between data from detectors belonging to different 
experiments. This provides strong consistency checks, which 
ultimately provides an additional handle on the assessment 
of errors in the final results. 

With a MDMC approach in Fourier (or spherical har¬ 
monic) space, data at different frequencies and with different 
beam sizes can be analyzed jointly. This joint analysis can 
be done straightforwardly by stacking all observations from 
different instruments in the same vector of observations y, 
as long as they cover the same area of the sky. This is bound 
to become of major importance for the future scientific ex¬ 
ploitation of multi-scale and multi-frequency data. 


5.5 Using single detector maps 

For a well calibrated instrument, the linear mixture model 
can be written in physical units, and the mixing matrix A 
depends only on the emission laws of components. Tradition¬ 
ally then, component separation is implemented on a set of 
maps per frequency channel (data from all detectors in each 
single frequency channel are combined into a single map). 
This approach should be preferred if good maps cannot be 
obtained independently for each detector (for sampling rea¬ 
sons, or because of striping...), and if all detector data at 
the same frequency can be combined (with some optimality) 
into one single map. 

An alternate solution, when calibration coefficients and 
noise properties for individual detectors (levels, correlations 
between the noise of different detectors) are not known pre¬ 
cisely, is to estimate parameters directly using single detec¬ 
tor maps in readout units (e.g. microvolts), which can be 
done naturally with our spectral-matching method. 


5.6 Comment on domain averaging 

We have considered band-averaged spectra as in defini¬ 
tion (3). In CMB studies, one may be more interested in 
quantities like £(£+l)Cj(£) which are expected to vary more 
slowly than C(£) itself. In this case, it may be more appro¬ 
priate to perform bin averages as 

^Y(g) = [ ^ ^(^ + 1) j ^ £(£ + 1) Y(£)Y(£y. (19) 


Spectral matching on such statistics would then yield esti¬ 
mates of 

Cj{q) = ( I] + 1 ) J ( 20 ) 

Xie'Dq / ie'Dq 

This weighted band-averaging can be used in our MDMC 
spectral-matching method as well. 


6 CONCLUSION 

This paper describes a spectral matching method for blind 
source identification in noisy mixtures. The method adjusts 
a simple model of the data to the observations. We estimate 
a physically relevant set of parameters (fundamental param¬ 
eters of the model: the mixing matrix, domain-averaged spa¬ 
tial power spectra of the sources and of noise) by maximum 
likelihood. Only unknown parameters are estimated, as the 
method lends itself easily to the modifications necessary to 
exploit partial prior information. Thanks to a Gaussian sta¬ 
tionary model, the likelihood depends only on a reduced set 
of statistics (average spectral density matrices of the obser¬ 
vations). An efficient, dedicated algorithm can adjust the 
parameters in just a few minutes on a modest workstation. 

Our method is of particular relevance for CMB data 
analysis in a multi-detector, multi-channel mission as 
Planck. 

First, the method permits the blind separation of un¬ 
derlying components, hence, of emissions coming from differ¬ 
ent astrophysical sources. Obtaining clean maps of emissions 
due to distinct astrophysical processes is crucial to under¬ 
standing their properties. 

Second, the blind method permits to estimate the num¬ 
ber of components (by repeating the adjustment with a 
varying number of sources). This will be of utmost impor¬ 
tance for analysing data from sensitive missions as Planck, 
in particular for the identification and characterisation of 
sub-dominant processes of foreground emission (e.g. free- 
free emission, non-thermal dust emission), or to track down 
systematic effects in the data. 

Third, the blind method can estimate the entries of 
the mixing matrix. This permits, if needed, to constrain 
the emission law (electromagnetic spectrum) of the different 
components contributing to the mixture, which is essential 
for understanding their physical properties and possibly the 
emission processes. 

Fourth, if strong sources, for which the mixing matrix 
is well recovered, contribute to the mixture, the method can 
provide a useful tool for the inter-calibration (or the abso¬ 
lute calibration) of the different detectors or of the different 
channels. 

Fifth, as our method is essentially a spectral matching 
method, which adjusts the spectra of a number of compo¬ 
nents to the observational data, it provides a direct mea¬ 
surement of the spatial power spectrum of the components 
in the mixture, of particular relevance for the CMB. 

As a final word, let us emphasize that the method can 
be applied to sets of data coming from different experiments. 
As the MDMC spectral matching approach, implemented in 
Fourier space, permits straightforwardly to account for beam 
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effects, it permits also to analyze jointly and blindly multi¬ 
experiment, multi-channel, multi-detector, multi-resolution 
data as long as they cover the same area of the sky. The 
method may become an essential tool for mapping and an¬ 
alyzing sources of emission observed with present and up¬ 
coming sub-millimeter experiments. 
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APPENDIX A: LINEAR COMPONENT 
SEPARATION 

The separation of astrophysical components relies on the 
key assumption that the total sky emission at frequency u 
is a linear superposition of a number of components as in 
equation 1. In principle then, the observation of the sky 
emission at several frequencies (i^i, z^ 2 , • • •) permits to recover 
estimates of the component templates Sj(0,(t)) by 

inverting equation 1. There are several methods for a linear 
inversion of the system when the mixing matrix A is known: 

(i) If there are as many noiseless observations as there are 
astrophysical components contributing to the total emission, 
by simple inversion of the square matrix A, so that the re¬ 
covered components, are given by S' = A~^Y; 

(ii) If there are more observations than astrophysical 
components, the system can be inverted using the pseudo 
inverse, S = [A'^A]~^A'^Y; 

(iii) For optimal signal to noise ratio under Gaussian 
statistics, without other prior assumption on the astrophys¬ 
ical con^onents, one can use a generalized least square so¬ 
lution, S = [A^A]~^A^R^^Y , where Rn is the noise 
correlation matrix; 

(iv) The choice S = = WY is 

the Wiener solution. It is the linear solution which minimises 
the variance of the error, but requires the knowledge of both 
the noise autocorrelation, Rn, and of the component auto¬ 
correlation, Rs. As < 1, this solution modifies the 

spatial spectra of the components since different weights are 
given to different spatial frequencies of a component map. 

(v) The renormalised Wiener solution, S = AfFT, where 

A = [diag(fFA)]“^, is the Wiener solution under the con¬ 
straint = 1. This solution renormalises the Wiener 

solution at each spatial frequency, so that no spatial filtering 
is applied to the data. 

In the above list, solution 1 is the special case of 2 when 
A is square and regular, 2 is the special case of 3 when the 
noise is white (Rn oc Id), 3 the special case of 4 when the 
signal is much stronger than the noise, and 5 a constrained 
version of 4 that does not modify the relative importance 
of different spatial frequencies in a component map after 
inversion. Depending on the method chosen, one or more of 


A, Rn and Rs (which can be considered as parameters of 
the model) is needed to implement the inversion. 

Realizing the fact that optimal component separation 
requires the prior knowledge of a set of parameters of the 
model is one of the driving ideas of our MDMC spectral- 
matching approach: we implement the joint estimation of 
all such parameters that are not necessarily known a priori. 


APPENDIX B: SPECTRAL MATCHING AND 
LIKELIHOOD 

This section shows that minimizing the spectral matching 
criterion (12) is equivalent to maximizing the likelihood of 
a simple model. 

Gaussian likelihood and covariance matching 

We first show how criterion (12) is related to a Gaussian 
likelihood. If y is a real n x 1 zero mean Gaussian random 
vector with covariance matrix R, then 

-2\ogp(y) = + logdet(27rR). (Bl) 

If y = [yi,..., ^t] is an n X T matrix made of T such vectors, 
independent from each other, with Cov(yt) = Rt, then 

T 

-2logp{Y) = '^ylRi^yt + logdet(2-7rJ?t) (B2) 

t=l 

Assume further that the index set [1,...,T] can be de¬ 
composed in Q subsets /i,..., /q such that Rt is constant 
with value R(q) over the qth subset, that is, Rt = R(q) if 
t E Iq. Then, eq. (B2) can be rewritten, using y'^R~^y = 
tr (^R~^yy^^ as 

Q 

-21ogp(y) = iTR(q) (^R(qy^) + logdet(R(g)) + cst 

g=l 

where R(q) = '^tei number of indices 

in Iq. This last expression also reads 

Q 

-21ogp(y) = '^nqD (Ry{q),Ry{qjj + cst (B3) 

9=1 

where the constant term is a function of the data Y via 
Ry(q) but not of any R(q). This form makes it clear that 
the mismatch (12) corresponds to the log-likelihood of a se¬ 
quence of zero mean Gaussian vectors which are modeled as 
having block-wise identical covariance matrices. 

Whittle approximation 

The statistical distribution of the Fourier coefficients 
of a stationary time series is a well researched topic. If T 
samples y(l),..., y(T) of an n-variate discrete time series 
are available, the Fourier transform is: 

1 

y{f) = X y{t) exp -2m ft. (B4) 

^ t=0 

For a stationary time series with spectral covariance matrix 
R(f), simple asymptotic (for large T) results are available. 
In particular, the Whittle approximation consists in approx¬ 
imating the distribution of the Fourier transform y(f) at 
DFT points / = g/T as follows: 
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• The real part and the imaginary part of y(f) are Gaus¬ 
sian, uncorrelated, with the same covariance matrix and 

= Rif)- 

• For 0 < p ^ p' < T/2 (assuming T even and for 
integers), y{p/T) is uncorrelated with y{p'jT). 

This is a standard approximation: it has been used for the 
blind separation of noise free mixtures of components by 
Pham & Garat (1997) and in the context of astronomi¬ 
cal component separation by e.g.Bouchet & Gispert (1999); 
Tegmark & Efstathiou (1996). 

Expression (B3) thus shows^ that the minimization 
of (12) is equivalent to maximizing (the Whittle approxi¬ 
mation to) the likelihood provided we model the spectra of 
the sources as being constant over spectral domains. 


APPENDIX C: AN EM ALGORITHM IN THE 
SPECTRAL DOMAIN 


The Expectation-Maximization (EM) algorithm (Dempster 
et al. 1977) is a popular technique for computing maximum 
likelihood estimates. This section first briefly reviews the 
general mechanism of EM and then shows its specific form 
when applied to our model. 

The EM algorithm. Gonsider a probability model 
p{y, s\0) for a pair (y, s) of random variables with 0 a param¬ 
eter set. If the variable s is not observed, the log-likelihood 
of the observed y is 

1(0) = logp{y\e) = log Jpiy, s\e)ds (Cl) 

Eor some statistical models, the maximization of the log- 
likelihood 1(0) can be made easier by considering the EM 
functional: 

^') = J ^og{p{y, s|6>)) p{s\y, e')ds. (C2) 


The EM algorithm is an iterative method which computes 


a sequence of estimates according to: 


O(^) ^(^^+1) _ argmax/ 

(C3) 

It can be shown that 


1(0", O') > 1(0', O') ^ 1(0") > 1(0') 

(C4) 


meaning that every step of the algorithm can only increase 
the likelihood. Actually, a stationary point of the algorithm 
also is a stationary point of the likelihood since 


81 ( 0 ) 

80 


81(0,0') 

80 


e'=e 


(C5) 


The EM algorithm is an interesting technique for maximiz¬ 
ing the likelihood if i) the computation of the conditional 
expectation in definition (G2) (E step) is and ii) the maxi¬ 
mization (G3) of the functional (M step) are computation¬ 
ally tractable. 


^ Actually some care is required to deal with the fact that the 
Fourier coefficients are complex-valued and that y(—f) = y(f)'''■ 
This introduces some minor complications in the computations 
but does not affect the final result. 


Both the E step and the M step turn out to be straight¬ 
forward because one elementary EM step amounts to solv¬ 
ing: 

In our model, the partial derivative in (G6) turns out to 
be a simple function of y and s, allowing the conditional 
expectation to be easily computed and eq. (G6) to be easily 
solved. This is sketched in the following. 


A single Gaussian vector. In order to introduce the nec¬ 
essary notations, we start by considering a simple case where 
y = As + n where s and n are independent Gaussian vectors 
with zero-mean and covariance matrices equal to Rs and Rn 
respectively. Then the parameter set is ^ = (A, Rs, Rn) and 
one has 


-2\ogp(y\s,0) = (y- As)^R^^(y - As) + log \Rn\ + cst 

—21ogp(s|^) = + log |Rs| + cst 


Using p(y,s) = p(y\s)p(s), the log derivatives of the joint 
density with respect to the components of 0 are: 


= «,-■[(»-A,),<] (C7) 

(C8) 

(C9) 


Thus, in this simple model, computing the conditional ex¬ 
pectations as in eq. (G6) would boil down to evaluating the 
conditional expectations of the random variables ss\ sy\ 
ys^ and yy^. This is a routine matter in a Gaussian model 
y = As n for which one finds: 



= Wi0)yy^W{0y + 0(0) 

(CIO) 

E{sy^\y,0) 

= W{0)yy^ 

(Cll) 

E{ys^y,0) 

= yy^W{0)^ 

(C12) 

Eiyy^\y,0) 

= yy^ 

(C13) 


with the following definitions for matrices C(0) and W (0)\ 
(7(61) = {A^RpA + Rp)~^ (C14) 

Wie) = {A'^R-^A + Rpy^A^Rp (C15) 


Note that C(0) = CoY(s\y,0) and that W(0) is the Wiener 
filter, that is E(s\y,0) — W(0)y. 


The EM algorithm in the Whittle approximation 

In our model, according to the Whittle approximation, the 
DET points y(k) are independent so that the EM func¬ 
tional (G2) for the whole data set simply is a sum over 
DET frequencies of elementary functionals. Thus an EM 
step 0' ^ 0 consists in solving 

k 

To proceed further, eq. (G16) is specialized to the case of 
interest by using two ingredients. Eirst, we use the rela¬ 
tion y(k) = As(k) + n(k) and the Gaussianity of each pair 
(y(k), s(k)); this is expressed via eqs. (G7-G9). Second, we 
use the approximation that the power spectra are constant 
over each spectral domain. Gombining these properties, the 
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cancellation (C16) of the gradient with respect to A, Rn and 


each Rs{q) yields 

0 = Rys(0') - A{0)Rss(0') (C17) 

0 = Ryyie') - A{e)Rsyi0') - Rys{O')A{0)^ 

+A{e)Rss{0')A{ey - Rn (C18) 

0 = RsF0',q)-Rs{0,q) {q=l,...,Q) (C19) 

where we have defined the matrix 

Rss(0,q) = — V S(s(fc)s(fc)t| y{k),0) (C20) 

keT)q 

and its weighted average over all domains 
Q _ 

Rss{e) = ^'^ Rss{e,q). (C21) 

g=l 


The same definitions hold for Rsy{q,0) (resp. Ryy(q,0)) 
as an averaged conditional expectation of s{k)y{k)^ (resp. 
y(k)y(k)^) and Rsy(O) (resp. Ryy(O)) as its weighted aver¬ 
age over spectral domains. 

Equations (C17-C19) are readily solved for uncon¬ 
strained A, Rn and Rs(q)- Recall however that our model 
involves diagonal covariance matrices so that the actual pa¬ 
rameter set is (A, Cj(^), cr^)). This constraint, however, pre¬ 
serves the simplicity of the solution of the M step since it 
suffices to use the diagonal parts of the solutions of (C17- 


G19). 

Thus, the M step boils down to 


A 

= RyFe')R,s{0')-^ 

(C22) 

2 

CTi 


Ryy{0') - Rys{0')Rss{0'r"Rsy{0') 

(C23) 

ii 


= [Rss(0\q)]ii 

(C24) 


The E-step of the algorithm essentially consists in com¬ 
puting the conditional covariance matrices Rxx(^)- In this 
step again, the linearity and the Gaussianity of the model, 
together with the domain approximation, again provides us 
with significant computational savings. Indeed, matrices C 
and W defined at eqs. (C14) and (C15) are actually con¬ 
stant over each spectral domain so that the E-step is imple¬ 
mented by the following computations which directly stem 
from (C14-C15) and from eqs.(C10-C13) : 


C{q) 

— {A^RCA + Rs{q) 

(C25) 

W{q) 

= iA^RCA + RFq)-Y"A^RC 

(C26) 

Rss{q) 

= W{q)Ry{q)Wiqy + Ciq) 

(C27) 

Rsyiq) 

= W{q)Ry{q) 

(C28) 


Erom this, one easily reaches the EM algorithm as de¬ 
scribed at algorithm 1. The description of this procedure is 
completed by specifying the initialization, the rescaling of 
the parameters and the stopping rule, as briefly discussed 
next. 


Some comments on EM implementation 

Rescaling is required because, as noted above, the model 
is not completely identifiable: the spectral density matrices 
Ry are unaffected by the exchange of a scalar factor between 
each column of A and each component’s power spectrum. 
We have found that this inherent indetermination must be 
fixed in order for EM to converge. Our strategy is, after each 


EM step, to fix the norm of each column of A to unity and to 
adjust the corresponding power spectra accordingly. This is 
an arbitrary choice which happens to work well in practice. 

The algorithm is initialized with the following parame¬ 
ters. We take Rn to be diag(Ry) where Ry = ^Ry(q). 

This is a gross overestimation since it amounts to assume 
no signal and only noise. The initial value of A is obtained 
by using the Nc dominant eigen-vectors of Ry as the Nc 
columns of A. Again, this is nothing like any real estimate of 
A, but rather a vague guess in ‘the right direction’. Einally, 
the spectra Pi(q) are taken as to be the diagonal entries of 
A^Ry(q)A which would be a correct estimate in the noise 
free case if A itself was. This ad hoc initialization procedure 
seems satisfactory. Note that it is a common rule of thumb 
to initialize EM with overestimated noise power. 

Regarding the stopping rule, recall (from sec. 3.3) that 
the EM algorithm is only used ‘halfway’ to the maximum 
of the likelihood and maximization is completed by a quasi- 
Newton technique Eor this reason, there is little point in 
devising a sophisticated stopping strategy: in practice, the 
algorithm is run for a pre-specified number of steps (based 
on a few preliminary experiments with the data). 


REFERENCES 

Baccigalupi C., Bedini L., Burigana C., De Zotti G., Earusi 

A. , Maino D., Maris M., Perrotta E., Salerno E., Toffolatti 

L. , Tonazzini A., 2000, MNRAS, 318, 769 

Bennett G. L., Halpern M., Hinshaw G., Jarosik N., Limon 

M. , Mather J., Meyer S. S., Page L., Spergel D. N., Tucker 
G., Wilkinson D. T., Wollack E., Wright E. L., 1997, in 
American Astronomical Society Meeting Vol. 191. p. 8701 

Benoit A., Ade P., Amblard A., Ansari R., Aubourg E., 
Bargot S., Bartlett J., Bernard J.-P., Bhatia R. S., Blan¬ 
chard A., Bock J. J., Boscaleri A., Bouchet E. R., Bour- 
rachot A., Gamus P., Gouchot E., de Bernardis P., De¬ 
labrouille J., Desert E.-X., Dore O., Douspis M., Du- 
moulin L., Dupac X., Eilliatre P., Eosalba P., Ganga K., 
Gannaway E., Gautier B., Giard M., Giraud-Heraud Y., 
Gispert R., Guglielmi L., Hamilton J.-G., Hanany S., 
Henrot-Versille S., Kaplan J., Lagache G., Lamarre J.- 
M., Lange A. E., Macias-Perez J. E., Madet K., Maffei B., 
Maffei D., Magneville G., Marrone D. P., Masi S., Mayet 
E., Murphy J. A., Naraghi E., Nati E., Patanchon G., 
Perrin G., Piat M., Ponthieu N., Prunet S., Puget J.-L., 
Renault G., Santos D., Starobinsky A., Strukov I., Sudi- 
wala R. V., Teyssier R., Tristram M., Tucker G., Vanel 
J.-G., Vibert D., Wakui E., Yvon D., 2002, submitted to 
A & A letters 

Benoit A., Ade P., Amblard A., Ansari R., Aubourg E., 
Bartlett J., Bernard J.-P., Bhatia R. S., Blanchard A., 
Bock J. J., Boscaleri A., Bouchet E. R., Bourrachot A., 
Gamus P., Gouchot E., de Bernardis P., Delabrouille J., 
Desert E.-X., Dore O., Douspis M., Dumoulin L., Du¬ 
pac X., Eilliatre P., Ganga K., Gannaway E., Gautier 

B. , Giard M., Giraud-Heraud Y., Gispert R., Guglielmi 
L., Hamilton J.-G., Hanany S., Henrot-Versille S., Hris- 
tov V. V., Kaplan J., Lagache G., Lamarre J.-M., Lange 
A. E., Madet K., Maffei B., Marrone D., Masi S., Murphy 
J. A., Naraghi E., Nati E., Perrin G., Piat M., Puget J.-L., 




MDMC spectral matching for CMB data analysis 15 


Santos D., Sudiwala R. V., Vanel J.-C., Vibert D., Wakui 
E., Yvon D., 2002, Astroparticle Physics, 17, 101 
Bersanelli M., Mandolesi R., 2000, Astrophysical Letters 
and Communications, 37, 171 
Bouchet F. R., Gispert R., 1999, New Astronomy, 4, 443 
Cardoso J.-F., Snoussi H., Delabrouille J., Patanchon G., 
2002, in Proc. EUSIPCO VoL 1, Blind separation of noisy 
Gaussian stationary sources. Application to cosmic mi¬ 
crowave background imaging, pp 561-564 
de Bernardis P., Ade P. A. R., Bock J. J., Bond J. R., 
Borrill J., Boscaleri A., Coble K., Grill B. P., De Gasperis 
G., Farese P. C., Ferreira P. G., Ganga K., Giacometti M., 
Hivon E., Hristov V. V., lacoangeli A., Jaffe A. H., Lange 
A. E., Martinis L., Masi S., Mason P. V., Mauskopf P. D., 
Melchiorri A., Miglio L., Montroy T., Netterfield C. B., 
Pascale E., Piacentini F., Pogosyan D., Prunet S., Rao S., 
Romeo G., Ruhl J. E., Scaramuzzi F., Sforna D., Vittorio 
N., 2000, Nature, 404, 955 

Delabrouille J., Melin J., Bartlett J., 2002, in Chen L., Ma 
C., Ng K., Pen U., eds, ASP Conf. Ser. 257: AMiBA 2001: 
High-Z Clusters, Missing Baryons, and CMB Polarization 
Vol. CS-257, Simulations of Sunyaev-Zel’dovich maps and 
their applications 

Dempster A., Laird N., Rubin D., 1977, Journal of the 
Royal Statistical Society B, 39, 1 
Funaro M., Oja E., Valpola H., 2001, in Proceedings of 
the 3rd International Conference on Independent Com¬ 
ponent Analysis and Blind Signal Separation, IGA 2001, 
San Diego Artefact detection in astrophysical image data 
using independent component analysis, pp 49-54 
Hanany S., Ade P., Balbi A., Bock J., Borrill J., Boscaleri 
A., de Bernardis P., Ferreira P. G., Hristov V. V., Jaffe 
A. H., Lange A. E., Lee A. T., Mauskopf P. D., Netterfield 
C. B., Oh S., Pascale E., Rabii B., Richards P. L., Smoot 
G. F., Stompor R., Winant C. D., Wu J. H. P., 2000, 
Astrophysical Journal Letters, 545, L5 
Hobson M. P., Jones A. W., Lasenby A. N., Bouchet F. R., 
1998, MNRAS, 300, 1 

Lamarre J.-M., Ade P., Benoit A., de Bernardis P., Bock 
J., Bouchet F., Bradshaw T., Charra J., Church S., Cou- 
chot F., Delabrouille J., Efstathiou G., Giard M., Giraud- 
Heraud Y., Gispert R., Griffin M., Lange A., Murphy A., 
Pajot F., Puget J.-L., Ristorcelli L, 2000, Astrophysical 
Letters and Communications, 37, 161 
Luenberger D. G., 1973, Introduction to Linear and Non¬ 
linear Programming. Addison-Wesley, Reading, MA 
Maino D., Farusi A., Baccigalupi C., Perrotta F., Banday 
A. J., Bedini L., Burigana C., De Zotti G., Gorski K. M., 
Salerno E., 2002, MNRAS, 334, 53 
Nuzillard D., Bijaoui A., 2000, Astron. Astrophys. Suppl. 
Ser., 147, 129 

Pham D.-T., Garat R, 1997, IEEE Tr. SP, 45, 1712 
Snoussi H., Patanchon G., Maciaz-Perez J., Mohammad- 
Djafari A., Delabrouille J., 2001, in Proc. MAXENT 
Bayesian blind component separation for cosmic mi¬ 
crowave background observations 
Tegmark M., Efstathiou G., 1996, MNRAS, 281, 1297 
Tegmark M., Eisenstein D. J., Hu W., de Oliveira-Costa 
A., 2000, The Astrophysical Journal, 530, 133 
Zaldarriaga M., Seljak U., 2000, The Astrophysical Journal 
Supplement Series, 129, 431 



